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Summary 

An implicit code for computing inviscid and viscous incompressible flows on unstructured grids is described. The foundation of the 
code is a backward Euler time discretization for which the linear system is approximately solved at each time step with either a point 
implicit method or a preconditioned Generalized Minimal Residual (GMRES) technique. For the GMRES calculations, several tech- 
niques are investigated for forming the matrix-vector product. Convergence acceleration is achieved through a multigrid scheme that 
uses non-nested coarse grids that are generated using a technique described in the present paper. Convergence characteristics are 
investigated and results are compared with an exact solution for the inviscid flow over a four-element airfoil. Viscous results, which 
are compared with experimental data, include the turbulent flow over a NACA 4412 airfoil, a three-element airfoil for which Mach 
number effects are investigated, and three-dimensional flow over a wing with a partial-span flap. 


Introduction 

In the past decade, much progress has been made in devel- 
oping computational techniques for predicting flow fields 
about complex configurations. These techniques include both 
structured- and unstructured-grid algorithms. The accuracy 
and efficiency of these codes is now such that computational 
fluid dynamics (CFD) is routinely used in the analysis and 
improvement process of existing designs and is a valuable 
tool in experimental programs and in the design of new con- 
figurations. 

Many existing codes referred to above have been devel- 
oped in support of the aircraft industry and, therefore, solve 
the compressible flow equations because of the need to han- 
dle the important effects associated with transonic Mach 
numbers. However, many important problems, such as those 
in the automobile industry and in biomechanics, are inher- 
ently incompressible and must be treated appropriately. 

With the success and wide availability in recent years of 
the compressible codes, these codes have naturally been con- 
sidered for use with incompressible flows by simply lowering 
the Mach number to minimize compressibility effects. Unfor- 
tunately, as the Mach number is successively decreased to- 
ward zero, the performance of the compressible codes in 
terms of both convergence rate and accuracy suffers greatly. 
In Ref. 52, Volpe demonstrated the poor performance of com- 
pressible flow codes under these conditions, particularly for 
Mach numbers below approximately 0.1. 

To overcome the difficulties associated with use of com- 
pressible codes, excellent progress has been made in the use 
of local preconditioners to extend the applicability of these 
codes to low Mach numbers. Several examples of this tech- 
nique, as well as the necessary theory, can be found in 
Refs. 13, 18, 24, 47, 48, 49, and 54. Preconditioning is indeed 
a viable means of extending the applicability of compressible 
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flow codes to the low-Mach-number range and continues to 
be an area of active research. A computer code that utilizes 
this technique has the added benefit of being able to handle 
both compressible and incompressible flows. This technique 
has been applied to both steady-state and time-dependent 
flows; for time-dependent flows, however, a subiterative pro- 
cess is required to maintain time accuracy. 

Another available method of extending a compressible 
flow code for use at zero Mach number is the method of arti- 
ficial compressibility first introduced by Chorin. 14 In this ap- 
proach, a pseudo-time derivative of pressure is added to the 
continuity equation, which allows the continuity equation to 
be advanced in a time-marching manner, much the same as 
the momentum equations. Artificial compressibility has been 
successfully applied by several researchers for both steady- 
state and time-dependent flows (e.g.. Refs. 12, 21, 32, 35, 36, 
43, and 44). When the temperature field is not required, this 
technique offers an advantage over preconditioning methods 
in that the energy equation is not solved; therefore, the effi- 
ciency of the algorithm is enhanced both in terms of computer 
time and reduced memory. The reduction in memory is par- 
ticularly significant for implicit codes on unstructured grids 
because the storage associated with the time linearization of 
the fluxes is reduced by the square of the local block size, or 
roughly 40 percent. As with preconditioning, the use of this 
method for time-dependent flows requires a subiterative pro- 
cedure to obtain a divergence-free velocity field at each time 
step. 

Recently, unstructured grids have been explored for CFD 
problems (e.g., Refs. 1, 5, 6, 10, 26, and 28). This approach 
has several advantages over structured grids for problems that 
involve complex geometries and flows. The biggest advan- 
tage is the reduction in time needed to generate grids within a 
complex computational domain. Another advantage is that 
unstructured grids lend themselves to adaptive-grid methods 
because new nodes can be added to a localized region of the 
mesh by modifying a small subset of the overall grid data 
structure. Although the unstructured-grid approach enjoys 
these advantages over structured grids, flow solvers that uti- 
lize it suffer from several disadvantages. These primarily in- 
clude a factor of 2-3 increase in memory requirements and 
computer run times on a per grid point basis. 

The purpose of the current work is to extend the unstruc- 
tured-grid compressible flow code described in Refs. 1. 2, 
and 10 to incompressible flows. The extension provides a 
code that can be used in the design of airplanes, ships, auto- 
mobiles, pumps, ducts, and turbomachinery. The extension 
also provides a tool for studying Mach-number effects on 
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high-lift airfoils because of the existence of both incompress- 
ible and compressible flow codes with similar levels of nu- 
merical accuracy that can be run on identical grids. The cur- 
rent code is a node-based upwind implicit code that uses 
multigrid acceleration (in two dimensions) to reduce the com- 
puter time required for steady-state computations. In this ex- 
tension, the artificial compressibility approach is used. The 
choice of this technique over preconditioning is based prima- 
rily on the desire to reduce memory requirements and com- 
puter time by reducing the number of equations. 

In the remainder of the paper, the governing equations are 
given, and the basic solution algorithm is described. Although 
both two- and three-dimensional results are shown in the pa- 
per, the description of the equations, algorithms, and bound- 
ary conditions are limited to two-dimensional flow to con- 
serve space. Results are presented to demonstrate the 
incompressible code. Inviscid flow results for a four-element 
airfoil are compared with an exact solution and are used for 
examining the effects of various parameters on the conver- 
gence behavior. Viscous, turbulent flow results for the NACA 
4412 airfoil are compared with experimental data, as well as 
with results from a well-known structured-grid compressible 
code run at a low Mach number. In addition, results are pre- 
sented for a three-element airfoil to study the effects of com- 
pressibility. These results are compared with results from an 
unstructured-grid compressible code and with experimental 
data. Finally, three-dimensional turbulent computations are 
shown for a wing with a partial-span flap. 

Governing Equations 

The governing equations are the incompressible Navier- 
Stokes equations augmented with artificial compressibility. 
These equations represent a system of conservation laws for a 
control volume that relates the rate of change of a vector of 
average state variables q to the flux through the volume sur- 
face. The equations are written in integral form as 


V^+jf r ndl-^f v ndl = 0 ( 1 ) 

dQ. 

where h is the outward-pointing unit normal to the control 
volume V . The vector of dependent state variables q and the 
inviscid and viscous fluxes normal to the control volume /, 
and /,, are given as 
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where (3 is the artificial compressibility parameter; u and t' 
are the Cartesian velocity components in the x and y direc- 
tions, respectively; 0 is the velocity normal to the surface of 
the control volume, where 

0 = n x it + n y v (5) 


and p is the pressure. The shear stresses in Eq. (4) are given 
as 
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X yy = (P + P,)— (6) 

Tty = (H + + V*) 

where p and p, are the laminar and turbulent viscosities, re- 
spectively, and Re is the Reynolds number. 

Solution Algorithm 

The baseline flow solver is an implicit upwind algorithm in 
which the inviscid fluxes are obtained on the faces of each 
control volume with a flux-difference-splitting scheme. For 
the current algorithm, a node-based scheme is used in which 
the variables are stored at the vertices of the grid and the 
equations are solved on nonoverlapping control volumes that 
surround each node. The viscous terms are evaluated with a 
finite- volume formulation that is equivalent to a Galerkin type 
of approximation for these terms. The solution at each time 
step is updated with the linearized backward Euler time-dif- 
ferencing scheme. At each time step, the linear system of 
equations is approximately solved with either a point implicit 
procedure or the Generalized Minimal Residual (GMRES) 
method. Details of the flux-difference-splitting scheme and 
the time-advancement scheme are given below. 

Finite-Volume Scheme 

The solution is obtained by dividing the domain into a fi- 
nite number of triangles from which non-overlapping control 
volumes are formed by the “dual” mesh described in Refs. 1 
and 5. The inviscid fluxes are evaluated on the faces of the 
control volumes with a flux-difference-splitting scheme simi- 
lar to that used in Refs. 21, 32, 35, and 43. 

The inviscid fluxes on the boundaries of the control vol- 
umes are given by 

* = \(f(q'\n) + f(q ■h))- l -\A\(q + -q ) (7) 

where is the numerical flux, / is the flux vector given in 
Eq. (3), q + and q are the values of the dependent variables 
on the left and right side of the boundary of the control vol- 
ume, and 


\a\ = t\a\t 1 (8) 

where |A| is a diagonal matrix whose elements are the eigen- 
values of the flux Jacobian, A , and are given by 

>^ = 0 

X 2 = ® + c (9) 

X 3 = ®-c 

and 

c = J® 2 + (3 (10) 


The matrices of right and left eigenvectors are given by 
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[A]" = [D]" + [0] 
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where c|> is a shear velocity perpendicular to 0 and equal to 
<|) = n x v-n y u (13) 


The simplest iterative scheme for obtaining a solution to 
the linear system of equations is a Jacobi type of method in 
which all off-diagonal terms (i.e., [O]" {Aq} ) are taken to 
the right-hand side of equation (15) and are evaluated with the 
values of {Aq}' from the previous subiteration level i. This 
scheme can be represented as 

[D}" {Aq}'* 1 = [{r}" -[0]{A 9 }‘] (18) 

The convergence rate of this process can be slow but can be 
accelerated somewhat by using the latest values of {Aq} as 
soon as they are available. This can be achieved by adopting a 
Gauss-Seidel type of strategy in which all odd-numbered 
nodes are updated first, followed by the solution of the even- 
numbered nodes. This procedure can be represented as 


In Eqs. (7)-(12), the ~ represents quantities evaluated with av- 
eraged values of the left and right states. The values of the left 
and right states q and q are evaluated with a Taylor series 
expansion about the central node of the control volume, so 
that the data on the face is given by 

tfface = 9„ode + ( 14 > 

where r is the vector that extends from the central node to the 
midpoint of each edge and Vq is the gradient of the depen- 
dent variables at the node and is evaluated with a least- 
squares procedure. 1-3-7 

Since the right and left eigenvectors given in Eqs. (11) and 
(12) contain the variable c (and therefore |3 ), the steady-state 
solution has a dependency on (3 , where larger values corre- 
spond to increased dissipation. 32 Numerical experiments have 
indicated that this influence is small for values of |3 below 
approximately 100. Also, since large values of (3 correspond 
to large values of c , if (3 is chosen to be very large, there is a 
wide disparity in the magnitudes of the eigenvalues. This dis- 
parity could lead to slow convergence rates in much the same 
manner as when a compressible flow solver is used at very 
low freestream Mach numbers. For these reasons, all the re- 
sults obtained in this paper use a (3 of 10. 

Time-Advancement Scheme 

The time-advancement algorithm is based on the linearized 
backward Euler time-differencing scheme, which yields a lin- 
ear system of equations for the solution at each time step: 

[• M n {Aq }" = {/•}" (15) 

where {/•}" is the vector of steady-state residuals, {Aq} 
represents the change in the dependent variables, and 
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The solution of this system of equations is obtained with ei- 
ther a fully vectorizable point implicit Gauss-Seidel 
procedure 1-3 or a preconditioned GMRES procedure. 37 

When using the Gauss-Seidel procedure, the solution of the 
linear system is obtained by a relaxation scheme in which 
{Aq}" is obtained through a sequence of iterates { A^r }' 
that converge to {A<jr}" . To clarify the scheme, [A]" is first 
written as a linear combination of two matrices that represent 
the diagonal and off-diagonal terms: 


[£>]"{A 9 }' +i = [{/•}"- [0]{A<jr} ( ' + 1)/ '] (19) 

where { Aqr } < ' + n - ' j s t jj e m0 st recent value of Aq , which 
will be at subiteration level i + 1 for the odd-numbered nodes 
that have been previously updated and at level i for the even- 
numbered nodes. 

Although the use of this algorithm offers improvement over 
the Jacobi iteration strategy, the convergence of the linear 
system can still be slow, particularly on fine grids. Fortu- 
nately, full convergence of the linear system is not necessary 
to provide a robust algorithm that remains stable at time steps 
much larger than an explicit scheme. In fact, if the residual is 
not linearized accurately, then solving the linear system be- 
yond truncation error of the non-linear equation is a waste of 
resources because the Newton type of convergence that is 
normally obtained as the time step is increased is lost. Numer- 
ical experiments over a wide range of test cases for both vis- 
cous and in viscid flow indicate that 15-20 subiterations at 
each time step is adequate. The memory requirements for this 
scheme are dominated by the storage of the flux Jacobians as- 
sociated with the linearization of the fluxes on each edge. In 
the present implementation, two matrices are stored on each 
edge and are associated with the linearization of the flux with 
the states on the right and left sides of the face. In two dimen- 
sions, each matrix is 3 x 3 so that a total of 18 storage loca- 
tions are required for each edge. In three dimensions, the ma- 
trices are each 4 X 4 so that 32 storage locations are required 
for every edge in the mesh. In equation (19), the multiplica- 
tion of the off-diagonal terms in the matrix by the correspond- 
ing values of {Aq} is computed by looping over the edges 
in the mesh and multiplying the flux Jacobians by the current 
values of { A^r } . Note that when the Gauss-Seidel scheme is 
used as described above, only the dependency of the flux on 
the nodes that lie at each end of an edge are included; thus, the 
linearization of the second-order residual is only approximate 
and would only be exact if the flux were computed with a 
first-order-accurate scheme. The convergence of the subitera- 
tive procedure is greatly enhanced by smaller time steps, 
which results in larger diagonal contributions. Therefore, a 
compromise must be made to allow Courant-Friedrichs-Lewy 
(CFL) numbers that are small enough for good convergence 
of the linear system but large enough to provide good conver- 
gence of the nonlinear system. Experiments have shown that 
although computations with CFL numbers of 500 or more re- 
main stable, the best convergence in terms of computer time 
for the Navier-Stokes equations is achieved for more moder- 
ate CFL numbers between 100 and 200. 

When the Gauss-Seidel scheme is used, practical applica- 
tion has shown that replacing the exact linearization of the 
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fluxes with an approximate linearization can provide a signif- 
icant increase in robustness, particularly on highly stretched 
grids used for turbulent flow calculations. This increase in ro- 
bustness is due to the loss of diagonal dominance often asso- 
ciated with the exact linearizations. For this reason, when the 
Gauss-Seidel scheme is used, the linearizations are based on 
linearizing Eq. (7) with |a| treated as constant matrix. 

As an alternative to the above procedure, the GMRES 37 
method can also be used. Note that this procedure only re- 
quires the formation of the product of [A] with a column 
vector and does not require the explicit inversion of [A] . 

In the present work, three methods are used to form the ma- 
trix-vector product required for GMRES. In the first method, 
the flux Jacobian matrices, stored on each edge, are formed 
from the data that lies at the end point of the edge. This basic 
procedure is the same as that described above for the Gauss- 
Seidel algorithm and is equivalent to an exact linearization of 
a spatially first-order-accurate scheme. 

The second technique that can be used to form the matrix- 
vector product is the use of a finite-difference approach 4,23 29 


Av, 


r(q + e\j)-r(q) 


( 20 ) 


where r(q + ev ,) is the residual evaluated by using per- 
turbed state quantities. In this study, e is a scalar quantity 
chosen so that the product of e with the root mean square 
(RMS) of Vj is the square root of “machine zero:” 


e - ^z/\\Vj\\ RMS (21) 

The choice of £ is based on keeping the perturbation to the 
dependent variables in Eq. (20) at a small and consistent level 
independent of the size of the mesh. Note, however, that the 
value of £ will not necessarily be small but will actually in- 
crease as the mesh size increases. This relationship can be 
seen by examining the variation of £ with increasing mesh 
size. Because the norm of Vj is always unity, the RMS value 
of v ; is determined solely by the inverse of the number of un- 
knowns in the mesh. In this case, as the mesh size increases, a 
corresponding decrease occurs in the size of each element of 
v j ; as a result, as the mesh size gets larger, a corresponding 
increase occurs in the magnitude of e . The selection of £ in 
this manner is computationaly efficient and is much more ef- 
fective than chosing a technique that results in a small value 
of £ . In the latter case, practical application has shown that 
the level of convergence that can be obtained depends greatly 
on the size of the mesh and often fails to converge to machine 
zero. 31 This may be attributable to the fact that, because £ is 
forced to be small, the size of the perturbation decreases as 
the mesh size increases. Eventually, the perturbation is essen- 
tially zero so that the matrix-vector product that is computed 
with Eq. (20) is inaccurate. By computing £ with Eq. (21), 
consistent convergence to machine zero is obtained indepen- 
dent of the mesh size. Note that for the incompressible equa- 
tions, the values of q are reasonably well scaled in that the 
size of a typical element of q is order one. If the magnitude of 
these variables is substantially different, then a more appro- 
priate choice of £ would be to require the product of £ with a 
typical size of an element of Vj to be roughly the square root 
of machine zero multiplied by a typical size of an element of 
<7 

In Eq. (20), if the computation of the residuals is the same 
as that used for the right-hand side of Eq. (15), then the result- 
ing matrix-vector product will match that obtained by using 
the exact linearizations of the second-order system, to within 
round off. If the same procedure is used, then the vectors 
computed in the Krylov subspace are essentially equal to 
those computed using the full linearization of the higher-order 


residuals; however, the need to compute and store the matrix 
is eliminated. 

The final method used to evaluate the matrix-vector prod- 
uct was introduced recently by Barth 8 . In this method, the ele- 
ments of the flux Jacobians are still stored along each edge in 
the same manner as when the exact linearization to the first- 
order scheme is used. However, rather than simply forming 
the linearizations from the data at the nearest neighbors, the 
flux Jacobians are formed from data that are extrapolated to 
the cell faces with a least-squares linear reconstruction proce- 
dure. In addition, the elements of Vj are “reconstructed” with 
the same linear reconstruction procedure. Using the method 
of Ref. 8, the effect of the exact linearization of the second- 
order spatial residual is computed by using the same amount 
of storage that is required for the linearizations of the first- 
order scheme. 

The preconditioning step for the GMRES procedure is done 
with one or more iterations of a point Gauss-Seidel procedure 
or an incomplete lower/upper (LU) decomposition 20 in which 
no fill in is allowed (i.e., ILU(O)). Note that only one iteration 
of the Gauss-Seidel procedure is equivalent to “block diago- 
nal” preconditioning. In all cases, the preconditioning is ap- 
plied to the left. When GMRES is used with ILU(O) as the 
preconditioner, the nonzero terms in the matrix are stored in a 
compressed-row storage format, 17 and the nodes in the mesh 
are reordered with a reverse-Cuthill-McKee algorithm 16 to 
cluster nonzero terms along the diagonal. In addition, the for- 
ward and back substitution steps, which must be conducted 
each time the preconditioner is applied, have been fully vec- 
torized with a level-scheduling algorithm. 38 Vectorization is 
accomplished by keeping a list of all edges that contribute to 
the nodes in a given level and coloring those edges to allow 
vectorization. Numerical experiments with the level-schedul- 
ing algorithm indicate that the computer time required for the 
forward and backward substitutions is reduced by a factor of 
approximately 3.3 in two dimensions and by a factor of ap- 
proximately 2.8 in three dimensions. A similar process has 
been used in Ref. 5 1 . 

The memory required for each of the above methods of 
solving the linear system is an important consideration for the 
practical usability of the schemes. As already discussed, the 
largest demand on memory for the Gauss-Seidel scheme 
comes from the storage of the Jacobians on each edge. For the 
GMRES algorithms, the storage can vary significantly, de- 
pending on what methodology is used for computing the ma- 
trix-vector product and what type of preconditioning is used. 
When one or more iterations of the Gauss-Seidel scheme are 
used for preconditioning, the Jacobians stored along each 
edge can be used for both the matrix-vector product and the 
preconditioning step. Therefore, this part of the overall stor- 
age requirement is the same as for the Gauss-Seidel scheme 
(used alone) in that the Jacobians are essentially stored only 
once. Note that when ILU(O) is used as a preconditioner, the 
flux Jacobians that contribute to the global matrix [A] 
(which is subsequently decomposed into approximate lower 
and upper triangular matrices) are formed with the same stor- 
age requirements as required with only nearest neighbors. In 
this way, even when the matrix-vector product matches that 
obtained by linearizing the second-order spatial discretiza- 
tion, the preconditioner corresponds to a lower order linear- 
ization. With the exception of the finite-difference technique, 
the use of ILU(O) preconditioning requires that the elements 
of the matrix be stored essentially twice; once to compute the 
matrix-vector product and once for the incomplete LU de- 
composition. Additional storage is required to store the vec- 
tors in the Krylov subspace and is given by the dimension of 
the subspace times the total number of unknowns in the mesh. 
Although this storage can be nontrivial with a large Krylov 
subspace, it is typically approximately one-third of that re- 
quired for the storage of the nonzero matrix elements. 


Boundary Conditions 

The boundary conditions on the wall correspond to tan- 
gency conditions for inviscid flows and to no-slip conditions 
for viscous flows. In the far field, a locally one-dimensional 
characteristic type of boundary condition is used, similar to 
that described in Refs. 32 and 44. By considering the linear- 
ized inviscid one-dimensional equations (where x is assumed 
to be the coordinate normal to the boundary), 

— + A^y- = 0 (22) 

at ax 

where A = ■=- — 

dq 

Equation (22) can be diagonalized using a similarity transfor- 
mation to yield a decoupled system of equations 

^ + A^ = 0 (23) 

dt dx 

where w represents a vector of characteristic variables 

<Mp + ® o 0) - 

P + (® 0 + c a )& ■ (24} 

p + (0 o -c o )0 

The second eigenvalue 0 + c is always positive and if it is 
assumed that the normal to the far-field boundary points out- 
ward, w 2 is the same on the boundary as in the interior of the 
mesh. In a similar manner. 0 - c is always negative; there- 
fore, w 3 is the same on the boundary as in the free stream. 
The relationship between the value of wq on the boundary de- 
pends on whether or not the flow is into or out of the domain; 
for inflow, the value on the boundary is the same as in the free 
stream; for outflow it is the same on the boundary as in the in- 
terior. These relationships provide three equations in three un- 
knowns that can be solved for the pressure, the normal veloc- 
ity, and the tangential velocity on the far-field boundary: 

<l> 0 ‘t’A ~^o Pb <boPr + 'bo®o®r-cZ < l>r 
1 0„ + c a 0 ® b = p, + (0 o + c o )0, (25) 

1 0 O - c a _<!>*_ _ Poo + (0„-c„)0. _ 

where the subscript r on the right-hand side of Eq. (25) refers 
to data taken from outside the domain for inflow and front in- 
side the domain for outflow. Also, the subscript i indicates 
data taken from inside the domain, and °o indicates data 
taken from outside the domain, which includes a point-vortex 
correction to account for lift. 45 In the current study, note that 
the values taken as reference conditions (those taken as con- 
stant in obtaining Eq. (25)) are evaluated at free-stream condi- 
tions to facilitate the linearization of the fluxes on the far-field 
boundary. 

For all boundary nodes, both on the solid boundaries and in 
the far field, the boundary conditions are not explicitly set but 
are obtained through the solution process in the same manner 
as the points interior to the domain. The only distinction be- 
tween boundary nodes and an interior node is that the enforce- 
ment of the boundary condition is reflected in the flux calcu- 
lation on the boundary and the appropriate linearization is 
taken into account on the left-hand side of Eq. (16). In this 
way, a fully implicit treatment of the boundary conditions is 
achieved. 


Convergence Acceleration Techniques 

To accelerate the convergence to a steady state, a multigrid 
algorithm is employed. 10 The algorithm is similar to that in 
Ref. 28 in that a full approximation scheme 11 is employed, the 
coarser grids are not directly obtained from to the finest one, 
and both V and W cycles can be used. The primary difference 
between the present implementation and that of Ref. 28 is at 
the boundaries for the interpolation of variables from one grid 
to another. In the present implementation, nodes that lie “in- 
side” a body such as an airfoil, as well as those contained in a 
tagged set of nodes near the surfaces, are translated to main- 
tain the distance to a wall instead of relying on an underlying 
structured grid to obtain the necessary translations. Further 
details of the present implementation can be found in Ref. 10. 

Turbulence Modeling 

For the current study, the one-equation turbulence model of 
Spalart and Allmaras is used. 41 At each time step, the equation 
for the turbulent viscosity is solved separately from the flow 
equations, which results in a loosely coupled solution process 
that allows for the easy interchange of new turbulence mod- 
els. The equations are solved with a backward Euler implicit 
scheme similar to that used for the flow variables. For the ap- 
plications in the current work, the linear system is solved at 
each time step by using 12 subiterations of the Gauss-Seidel 
procedure. Following the recommendations of Ref. 41 the lin- 
earizations of the production and destruction terms should be 
modified to ensure positive eddy viscosity throughout the 
computation. The modification eliminates the possibility of 
obtaining Newton-type convergence for the turbulence model. 
Although this problem can possibly be remedied by using the 
full linearizations in the later stages of convergence, in the 
current work the modifications to these terms are kept intact 
throughout the entire computation. On solid surfaces, the de- 
pendent variable (related to the eddy viscosity) is set to zero; 
in the far field, it is extrapolated from the interior for the out- 
flow and taken to be free stream for the inflow. For the spatial 
discretization, first-order upwind differencing is used for the 
convective terms, and the higher order derivatives are evalu- 
ated in the same manner as for the flow solver. The gradients 
required for the production terms are not evaluated with the 
least-squares procedure; rather. Green’s theorem is used. 
Green’s theorem is used because numerical experiments have 
shown that although the least-squares procedure is essential 
for accurately determining data on boundaries of control vol- 
umes for stretched grids, its use for computing actual gradi- 
ents can be inaccurate. 1 Failure to properly evaluate these 
terms often leads to an inaccurate calculation of the eddy vis- 
cosity. 

Two-Dimensional Grid Generation 

Before proceeding to the results, a brief description of the 
methodology used for computing both viscous and inviscid 
grids in two dimensions is given. The two-dimensional grids 
for this study were constructed with an in-house grid-genera- 
tion program known as TRI8IT. 34 This program triangulates a 
multiply-connected domain using an incremental point inser- 
tion and a local edge-swapping algorithm. The TRI8IT pro- 
gram is capable of generating grids suitable for both inviscid 
and viscous CFD applications because it can generate both 
isotropic and highly stretched triangles. 

The TRI8IT grid-generation process starts by defining the 
boundaries of the computational domain. Domain boundaries 
are characterized by simple closed curves that are composed 
of one or more segments; each segment is a smooth curve that 
can be splined independently. The boundaries are defined in 
an input file by a list of sequential grid points or by a list of 
sequential knots for a parametric cubic spline. When a list of 
knots is specified, grid points are smoothly distributed along 
splined segments of the boundary curve with user-specified 
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parameters to control the point distribution. The spacing pa- 
rameters for each segment consist of spacing values at se- 
lected knot locations and an integer value that specifies the 
total number of points for the segment. Together, these pa- 
rameters provide control over the boundary point distribution. 
Once the boundary point distribution has been determined, a 
single large triangle, which is used as the initial triangle for 
the triangulation algorithm, is generated to encompass the en- 
tire domain. By using the single triangle as the initial triangu- 
lation, boundary points along each segment are sequentially 
inserted into the triangulation by connecting grid lines from 
the point to the vertices of the triangle in which the point is lo- 
cated. After the point is inserted, edge swapping is performed 
to locally optimize the triangulation around the new grid 
point. In the present algorithm, the optimization criterion is 
based on interior angles of neighboring triangles; edges are 
swapped to minimize the maximum angles in the local trian- 
gulation. This local optimization procedure is discussed in 
more detail by Barth in Ref. 9. Figure 1 illustrates the point 
insertion and local edge-swapping algorithm. When boundary 



(a) Insert point, (b) Connect vertices, (c) Swap edges. 


Fig. 1. Point insertion and local edge swapping. 


grid points are inserted, edge swapping is used to align grid 
edges with the boundary curve. After the domain boundaries 
are inserted, triangles outside the computational domain, such 
as triangles inside the airfoil boundaries or outside the far- 
field boundary, are removed from the grid. Figure 2 shows a 
view of the triangulation near the surface of the airfoil before 
and after the triangles have been removed. After undesirable 



(a) Before. (b) After. 

Fig. 2. Triangles inside airfoil boundaries are removed. 


triangles have been excluded from the grid, the next step of 
the grid-generation process is to insert field points to produce 
a grid of isotropic triangles. The TRI8IT program provides 
several techniques to generate field points. One technique is 
the approach of Holmes and Snyder; 22 in this approach, the 
field points are inserted into a triangulation to continually re- 
duce the cell aspect ratios. Figure 3(a) shows a grid generated 
with the technique of Holmes and Snyder. Although grids 
generated with this approach are sometimes coarse, they pro- 
vide a sufficient framework for constructing contours that are 
used to generate the stretched triangulations. 


Stretched triangulations are constructed once the domain 
has been discretized into triangles. The process for generating 
a stretched triangulation involves several steps. The first step 
uses the current triangulation (of isotropic triangles) as a 
framework for constructing contours of a field variable 
(Fig. 3(b)). In this application, the field variable contoured is 
a measure of the distance from the field point to the nearest 
point on the airfoil surface. Level curves (contours) of the dis- 
tance field variable are constructed with three user-specified 
parameters, which govern the geometric stretching and spac- 
ing between contour levels. The three specified parameters 
are: normal spacing at the airfoil surface Ar| 0 , outer boundary 
distance, and the total number of level curves (contours). 
With these user-specified parameters, a stretching value r is 
determined for a geometric stretching function in which the 
stretching value controls the spacing between contour levels. 
For example, the spacing Aq between contour levels r| ; and 
r| i+ j is controlled by 


Ar|; + 1 = '-Ar|,- 

where i is an integer number for each level curve that in- 
creases incrementally from zero at the surface to N at the 
outer boundary. Figure 4 illustrates the geometric stretching 
between the distance function contour lines. After the level 



(a) Isotropic triangulation. (b) Level curves. 


Fig. 3. An isotropic triangulation and distance function 
contours for advanced EET airfoil. 



curves have been determined, the triangulation of isotropic 
triangles is discarded and a new triangulation of stretched tri- 
angles is started. In a manner similar to the generation of the 
discarded triangulation, airfoil boundary points are inserted 
into an initial single triangle that encompasses the domain. 
After the airfoil points have been inserted, field points are in- 
serted along level curves by projecting points outward from 
one level curve to the next. This projection process begins at 
the airfoil boundary ( T| 0 ) and ends at the last level curve 
(q w ). The projection process involves construction of an out- 
ward-pointing normal vector for each grid point on the current 
level curve (T|, ). This normal vector is used to project the grid 
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point from the current level curve to the next level curve 
(T | 1 + 1 ), where the location of the projected point is deter- 
mined by the point of intersection of the normal vector and 
the next level curve. After all points have been projected to 
the T| ; + | level curve, the smoothness of the point distribution 
along the T| ; + 1 level curve is evaluated. If the spacing be- 
tween a pair of points along the curve is large in comparison 
with the spacing between neighboring pairs of points, addi- 
tional points are added in the coarse region. Similarly, if the 
spacing between a pair of points is small in comparison with 
the spacing between neighboring pairs of points or small in 
comparison with the spacing between level curves Ar|, + 1 , 
then points will be removed. Only after a smooth distribution 
along the level curve is obtained will the potential grid points 
be inserted into the new triangulation. Thus, after points are 
projected from the airfoil boundary to the first level curve, 
they are in turn projected outward to the next level curve and 
inserted. This process continues until the last level curve is 
reached. As seen in Fig. 5, the resulting grids obtain a very 
“structured" appearance. 

A Perl 53 script is used to automate the entire process, in- 
cluding the generation of the sequence of coarser grids for 
multigrid applications. After splining the surfaces, the user is 
prompted to input the normal spacing at the airfoil boundary, 
the distance to the outer boundary, the number of level curves, 
and the number of coarser grids desired. The approach 
adopted for generating the coarser grids involves the removal 
of every other grid point from the boundary distribution and 
of every other level curve from the distance function con- 
tours. Figure 5 shows a sequence of three grids for the ad- 
vanced Energy-Efficient Transport (EET) airfoil. 

Results 

Results are presented below for four cases. The first case is 
an inviscid case for which an exact solution exists. This case 
is used to compare the convergence rate of various options in 
the code. The remaining three cases include two two-dimen- 
sional viscous test cases, as well as an initial result for three- 
dimensional viscous computations. For each of the tests cases, 
the value of the artificial compressibility parameter is set to 
10. All results have been obtained on a Cray Y/MP computer 
located at the NASA Langley Research Center, with the ex- 
ception of the three-dimensional case, which utilized the Cray 
C-90 located at the Numerical Aerodynamic Simulator 
(NAS). 

Four-Element Airfoil of Suddhoo-Hall 

The first case considered is an inviscid flow over a four-el- 
ement airfoil for which an exact potential flow solution is 


available. 40 This case is used to examine the convergence be- 
havior of many of the available options to evaluate the effi- 
ciency in terms of both computer time and memory. The 
many possible combinations of options (e.g., multigrid, mesh 
sequencing, techniques for solving the linear system, methods 
for forming the matrix-vector product, and preconditioning) 
make selection of the “best” strategy for all cases difficult, if 
not impossible. However, the intent here is only to examine 
some effects of different parameters on the solution time and 
the memory required in order to arrive at a good strategy that 
can be used successfully for a wide array of cases. 

Figure 6 shows the grid used for this case, which consists 
of 25,862 nodes and 50,213 triangles, with 512 nodes on the 
surface of the main element and 312 nodes on the surface of 
each of the remaining elements. The grid has been generated 
with the method described previously. 



The computed pressure distribution is compared with the 
exact incompressible solution in Fig. 7. The agreement be- 
tween the computed and the exact solution is good for each of 
the elements. 

As previously mentioned, the solution for this case has 
been obtained with several variations of input parameters. 
These parameters include the technique used to solve the lin- 
ear system, multigrid acceleration, mesh sequencing, and var- 
ious other parameters necessary for use with GMRES (e.g., 
the dimension of the Krylov subspace, the tolerance for solv- 
ing the linear system, and the number of cycles to apply). An 



(a) Fine. (b) Medium. (c)Coarse. 

Fig. 5. Sequence of three grids for Advanced EET airfoil. 
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Fig. 7. Comparison of computed and exact pressure distribution 
for four-element airfoil of Suddhoo and Hall. 

overall summary of the results is given in Table 1. Here, and 
in the discussion that follows, each set of parameters is re- 
ferred to by a case number that is given in the first column of 
the table. The results are primarily organized according to the 
technique used to obtain an approximate solution to the linear 
system. For the calculations obtained with GMRES, the re- 
sults are then grouped according to how the matrix-vector 
product is calculated. When GMRES is used, additional infor- 
mation is supplied in regard to the number of search direc- 
tions, the convergence tolerance for the linear system, the 
number of GMRES cycles, and the type of preconditioner em- 
ployed. For all calculations given in the table, the CFL num- 
ber has been either ramped linearly over 50 iterations or is 
tied to the change in the pressure so that as the solution con- 
verges the CFL number increases proportionally. In all cases, 
however, the maximum CFL number is limited to that shown 
in the table. In addition, although most cases were run to ma- 
chine zero, for the present study the time required for conver- 
gence is considered to be that necessary to achieve a 6-order- 
of-magnitude reduction in the residual. 

Figures 8 and 9 are convergence histories for a few selected 
results from the table. Figure 8 shows convergence for cases 
in which the exact linearizations of the fluxes are not used. 
These correspond to cases 1-13 in the table and are referred to 
here as non-Newton type schemes. Results for which the 
exact linearizations are used correspond to cases 14 through 
22 and are referred to as Newton-type schemes. Note that, al- 
though formation of the matrix- vector product with the finite- 
difference methodology is not exact because of round-off er- 
rors, the matrix-vector product is considered to be exact for 
the present putposes. 

In Fig. 8, several results for the non-Newton schemes are 
shown. Here, the single-grid (nonmultigrid) results for which 
the point iterative method is used to solve the linear system at 
each time step are referred to as the “baseline” scheme, de- 



Fig. 8. Convergence history for non-Newton schemes for four- 
element airfoil of Suddhoo and Hall. 
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Fig. 9. Convergence history for Newton-type schemes for four- 
element airfoil of Suddhoo and Hall. 

noted in the table as case 1. For this scheme, 15 Gauss-Seidel 
subiterations are used at each global time step, and the CFL 
number is linearly ramped from 10 to 200 over 50 global iter- 
ations. The residual for this case drops 6 orders of magnitude 
in about 540 iterations and takes approximately 600 seconds 
on the Cray Y/MP 

Also shown in the figure are results obtained with one cycle 
of GMRES with ILU preconditioning, where the dimension 
of the Krylov subspace is 12 and the tolerance is set to 0.1. 
The CFL number for this case has been allowed to go as high 
as 50,000 and is increased as the solution converges. Here, the 
residual drops much faster; only 142 iterations are required to 
obtain the convergence criteria. For this case, the computer 
time is reduced over the baseline scheme; approximately 419 
seconds are required to reach the convergence criteria. In the 









table, results are shown in cases 5 and 6 that are identical to 
the previously described case, except that a smaller tolerance 
is placed on solving the linear system and more GMRES cy- 
cles are allowed to achieve the specified level of convergence 
of the linear system. A comparison of cases 5 and 6 with case 
12 clearly shows-that a more converged solution to the linear 
system requires a substantial increase in the time needed to 
reach convergence. Because in these cases the linearization of 
the residual is approximate, the fast convergence associated 
with Newton’s method is lost; as a result, using large time 
steps and obtaining a good level of convergence of the linear 
system are a waste of time. 

Also shown in Fig. 8 are results obtained with multigrid ac- 
celeration. Cases 2 and 4 indicate results obtained with a 
three-level V cycle and a four-level W cycle, respectively. 
Here, the linear system is solved at each time step with the 
Gauss-Seidel scheme. For the three-level V cycle, 15 subiter- 
ations of the Gauss-Seidel iterative scheme are used, whereas 
for the W cycle only 5 subiterations are used. In general, the 
number of subiterations can be reduced with a W cycle. This 
trend is also observed when more coarser grids are used. For 
both the V-cycle and W-cycle cases, the computer time re- 
quired to reduce the residual by 6 orders of magnitude is sig- 
nificantly decreased over the baseline scheme; the V cycle re- 
quires 158 seconds, and the W cycle requires only 70 seconds. 
Similar results are also shown in the figure for the case in 
which GMRES with ILU preconditioning has been used in 
conjunction with multigrid. The number of iterations to reach 
convergence is less for this case than for the three-level V cy- 
cle; however, as seen in the table approximately 40 percent 
more computer time is required. 

In Fig. 9, results are shown for schemes in which the exact 
linearizations are used. The residual history with the finite- 
difference methodology for forming the matrix-vector prod- 
uct and that of Ref. 8 are identical. This is not surprising be- 
cause each method is essentially exact. Note that in cases 15 
and 19, although the number of iterations required for conver- 
gence is substantially reduced over the baseline scheme, the 
computer time required is somewhat higher. For these cases, 
approximately 20 global iterations are required to obtain an 
initial 2-order-of-magnitude reduction in the residual. A fur- 
ther reduction of 4 orders of magnitude requires only 3 to 5 it- 
erations because fast convergence is obtained when the solu- 
tion is close enough to the root. To reduce the time required, 
mesh sequencing has been used, where 5-iterations are con- 
ducted on a coarse grid of only 2052 nodes. This solution is 
then interpolated to a finer grid of 7044 nodes, where 10 addi- 
tional iterations are done. Finally, the solution is interpolated 
to the finest grid, on which the computation is concluded. The 
effect of this in terms of iterations is seen from Fig. 9 to be not 
particularly dramatic. However, the computer times listed in 
the table indicate that a factor-of-3 reduction in computer time 
can be achieved by doing more of the initial work on the 
coarser grids. 

The last curve shown in Fig. 9_represents case 22, in which 
the exact linearizations are used in conjunction with GMRES 
and ILU preconditioning as well as with multigrid accelera- 
tion. For these calculations, 12 search directions are used, and 
the tolerance on the linear system is only 0.1. The figure 
shows that significantly fewer iterations are required to obtain 
a 6-order-of-magnitude reduction in the residual over the non- 
multigrid results, although “machine-zero” is achieved in ap- 
proximately the same number of iterations as before. Table 1 
indicates that the computer time required is 90 seconds, which 
is substantially less than the time required for the solutions 
obtained with mesh sequencing. Note that case 17 in the table 
is similar to case 22, except that the matrix- vector product has 
been formed with the finite-difference method given in 
Eq. (20). However, the maximum CFL number in this case is 
only 10,000 rather than 50,000, which was used in case 22. 


This is because when a CFL number of 50,000 is used, the 
convergence of the finite-difference method stalled. Although 
this technique requires less memory than the method of 
Ref. 8, it appears to be somewhat more sensitive to parameter 
variations. 

Based on the results in Table 1 and Figs. 8 and 9, several 
conclusions can be reached. First, the performance of 
GMRES can depend greatly on the choice of parameters. 
Also, the Newton schemes require fewer iterations to obtain 
convergence when compared with the non-Newton schemes. 
However, by examining the computer times in the table, the 
fastest convergence in terms of iterations does not necessarily 
correspond to the lowest computer time. In all cases, regard- 
less of the technique that is used to solve the linear system and 
whether exact linearizations are used, the use of coarser grids 
either through mesh sequencing or multigrid acceleration of- 
fers a reduction in computer time over any method in which 
only one grid is used. In addition, although fast convergence 
in terms of both iteration count and computer time can be ob- 
tained with the full linearization in conjunction with GMRES, 
this technique does not result in less computer time than the 
simpler combination of multigrid in which the Gauss-Seidel 
scheme is used to obtain an approximate solution to the linear 
system. In addition, the use of GMRES comes at the expense 
of increased memory over the simpler scheme. For this rea- 
son, the remaining results shown in this paper are all obtained 
using the Gauss-Seidel scheme and multigrid acceleration. 

Although not shown, similar experiments to those de- 
scribed above have been conducted for both laminar and tur- 
bulent flows with no significant change in the results. For tur- 
bulent flows, however, only the non-Newton schemes have 
been investigated because the turbulence model is decoupled 
from the flow equations so that convergence rates associated 
with Newton’s method are not possible. Also, multiple grids, 
in which different techniques are used for each grid, are im- 
plemented in the codes; however, this technique has not been 
investigated in depth. However, as previously mentioned, the 
reason for examining the different schemes is to arrive at a 
methodology that achieves a good balance between memory 
and computer time and can be used for practical problems. 

NACA 4412 Airfoil 

The next case considered is the viscous flow over a NACA 
4412 airfoil; the results are compared with the experimental 
data obtained in Ref. 15. The flow conditions include an angle 
of attack of 13.87°, a Reynolds number of 1.52 million 
(based on the chord length of the airfoil), and a free-stream 
velocity for the test of approximately 60 m.p.h. 
(37^ = 0.07) . Comparisons of the computed results and ex- 
perimental data are shown in Figs. 10 and 1 1 for computa- 
tions with the present (incompressible) code and with a well- 
known compressible code (CFL3D) 46 run at a free-stream 
Mach number of 0.2 . The grid used for the unstructured in- 
compressible computations consists of 22,595 nodes with a 
spacing at the wall of approximately 5 X 10~ 6 normalized to 
the chord of the airfoil; a partial view is shown in Fig. 12. 
The grid used for the computation with CFL3D is a 361 x 1 13 
C mesh with similar spacing at the wall. Figure 10_shows a 
comparison of the computed and experimental pressure distri- 
butions. The computations generally agree well with each 
other; however, a discrepancy with the experimental data oc- 
curs toward the aft end of the airfoil. Velocity profiles are 
shown in Fig. 1 1 for the two computations, as well as for the 
experimental results. Again, the computations agree well but 
show a discrepancy with the experimental results in the sepa- 
rated region toward the aft end of the upper surface. Compu- 
tations have been conducted with the same turbulence model 
in Refs. 30 and 50 that show similar results. 
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Fig. 10. Comparison of computed and experimental pressure dis- 
tributions for NACA 4412 airfoil with a = 13.87 ° and 
Re = 1.52 x 10 6 . 
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Fig. 11. Velocity profiles for NACA 4412 airfoil with 
a = 13.87° and Re = 1.52 x 10 6 . 

Advanced Energy Efficient Transport (EET) 3-Element Airfoil 

The last two-dimensional case examined is the flow over a 
three-element airfoil that has undergone extensive testing in 
the Low Turbulence Pressure Tunnel (LTPT) located at 
NASA Langley Research Center. 2 - The cases considered here 
examine the influence of the free-stream Mach number on so- 
lutions over a wide range of angles of attack and the suitabil- 
ity and consequences of assuming incompressibility. In the 
following results, computations are obtained with the incom- 
pressible and the compressible code. The results are com- 
pared with experimental data at two free-stream Mach num- 
bers to examine the effects of compressibility and to assess 
the ability of the codes to accurately predict trends due to 
Mach-number variations. For the first Mach number of 0.15, 


the assumption of incompressibility is expected to be accept- 
able; at the relatively high Mach number of 0.26, compress- 
ibility effects are quite important. All results are obtained for 
a Reynolds number of 9 x 10 6 . 

The fine grid used for the computations consists of 70,686 
nodes with normal spacing at the wall of 2 X 10 , based on a 

reference chord of the airfoil with the elements retracted. This 
grid was generated with the same procedure used for the 
NACA 4412 airfoil, which is shown in Fig. 5. The conver- 
gence history for the incompressible code in terms of CPU 
time on a Cray Y/MP is shown in Fig. 13 for the four com- 
puted angles of attack of 0° , 8° , 16° , and 22° . For each re- 
sult, a three-level V cycle has been used, and the CFL number 
has been linearly ramped from 10 to 200 over the first 100 it- 
erations. Although not shown, the convergence histories for 
the compressible code are similar but the compressible code 
requires approximately 30 percent more computer time. This 
difference is simply because three equations are solved with 
the incompressible code; the compressible code solves four 
equations. In addition, for the compressible code at a Mach 
number of 0.26, a flux limiter was used at the highest angle of 
attack because of a reasonably strong shock wave on the slat. 
The figure shows that the computer time required to obtain 
steady lift increases with angle of attack; the time ranges from 
approximately 12 minutes for the lowest angle of attack to 
about 30 minutes for the highest. For the two lower angles of 
attack, the residual drops steadily; for the two higher angles, 
the residual essentially stalls. By examining details of the so- 
lution at intervals 50 iterations apart, this has been traced to a 
small level of unsteadiness in the solution located underneath 
the slat, where the upper surface and lower surfaces join. Al- 
though not apparent from the grid shown in Fig. 5, this junc- 
ture is not sharp but has a small finite thickness, as do the 
trailing edges of the slat, main element, and flap. Because of 
the small size of this area, the effect of the unsteadiness on the 
overall lift is only in the fourth significant digit and is, there- 
fore, not noticeable in Fig. 13. 

Pressure distributions on each element are shown for an 
angle of attack of 16° in Fig. 14. A summary of the lift coef- 
ficients on the individual elements as well as for the total con- 
figuration is shown in Fig. 15 for incompressible computa- 
tions compared with both experimental data and compressible 
computations at a Mach number of 0.15. It is seen from the 
figures that the agreement between the computations and ex- 
periment is quite good for both the lift values and the pressure 
distributions. Furthermore, an examination of the pressure 
distributions over the elements shows little difference be- 
tween the incompressible solutions and the compressible so- 
lutions at a freestream Mach number of 0.15. Although differ- 
ences between the incompressible and compressible solutions 
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Fig. 13. Convergence history for incompressible-flow codes for 
three-element airfoil at several angles of attack with 
Re = 9 x 10 6 . 

are difficult to discern from the pressure distributions, the in- 
compressible solution exhibits slightly lower magnitudes in 
the pressure coefficients on the elements. This difference 
leads to a correspondingly lower lift, particularly on the main 
element. Nevertheless, at this Mach number, the incompress- 
ible assumption does not compromise the overall solution in 
terms of the agreement with experimental results. 



Fig. 15. Comparison of experimental lift and computed lift versus 
angle of attack for incompressible and compressible flow solu- 
tions. 

Figures 16 and 17 compare the incompressible and com- 
pressible computations with the experimental data at Mach 
number of 0.15 and 0.26. Fig. 16 shows a dramatic difference 
in the experimental data at the two different Mach numbers. 
Over most of the range of angles-of-attack, the higher Mach 
number conditions yield a higher lift value on the main ele- 
ment with a corresponding increase in the total lift for the con- 
figuration. Flowever. at an angle of attack of 22° , it is appar- 
ent that the experimental lift value is past the angle of attack 
for maximum lift and that this trend has been accurately com- 
puted. An examination of the computational results indicates 
that at the higher free-stream Mach number, a shock wave is 
present on the leading edge of the slat and that the Mach num- 
ber ahead of the shock is approximately 1 .4 . This reasonably 
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Fig. 14. Pressures for three-element airfoil at a Mach number of 0.15 compared with incompressible and compressible computations. 
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ment indicates a lower pressure coefficient for the higher 
freestream Mach number. 



Fig. 16. Comparison of experimental lift versus angle of attack 
withincompressible-andcompressible-flowsolutionsattwodif- 
ferent Mach numbers. 

strong shock causes the flow over much of the upper sur- 
face of the slat to separate and a much thicker wake be- 
hind the slat is obtained over that of both the lower Mach 
number and incompressible computations. Although not 
shown, examination of the skin frictions shows that the 
boundary layer on the main element, as well as the flap is 
fully attached. 

In Fig. 17, the computed incompressible and com- 
pressible pressure distributions are compared with exper- 
imental data at both Mach numbers at an angle-of-attack 
of 16° . The comparison between the computational and 
experimental results is good and the trends are accurately 
predicted. The suction peak on both the slat and main ele- 


Three Dimensions: Wing with Partial Span Flap 

Three-dimensional turbulent computations are shown 
below for the flow about a wing with a partial span flap. This 
geometry has been recently studied experimentally in the 
Ames 7 x 10 wind tunnel 39 with computations reported in 
Ref. 27. For this geometry, the gap and overlap between the 
flap and the wing is g/c = 0.019 and o/c = 0.004 respec- 
tively. Here, g is the distance from the chord line of the main 
wing to the highest point on the flap, o is the distance be- 
tween the leading edge of the flap and the trailing edge of the 
main wing, and c is the reference chord. A depiction of the 
geometry as well as the surface grid used in the calculations is 
shown in Fig. 18. The grid has been generated with the tech- 
nique described in reference 33 includes the wind-tunnel ceil- 
ing and floor as well as the side walls. The grid consists of 
549,176 nodes and 3,179,640 cells with a normal spacing at 
the wall of 1 x 1CF 5 nondimensionalized by the chord of the 
unflapped portion of the wing. The angle of attack is 10° and 
the Reynolds number is 3.7x10 ’ . The computation for this 
case has been performed on the Cray C-90 located at the Nu- 
merical Aerodynamic Simulator (NAS). 

For this calculation, the backward-Euler scheme has been 
used in which the linear system is approximately solved using 
the point Gauss-Seidel method. Multigrid acceleration is not 
currently incorporated into the three-dimensional code and is 
therefore not used. The convergence history for this case is 
not shown, but the residual has been reduced by 3 1/2 orders 
of magnitude in 250 iterations and requires approximately 4.5 
hours of computer time. A comparison of pressure distribu- 
tions at four locations along the wing span is shown in 
Fig. 19. Note that in these comparisons, the flap has been ro- 
tated back to a zero degree deflection and then translated rear- 
ward to separate it from the main wing. The agreement be- 
tween the computed pressure distributions is fairly good for 
all four span stations. The variation in pressure distributions 
due to spanwise location on the wing are well predicted, how- 
ever, it appears that more grid resolution is required for ob- 
taining the suction peaks. 




Fig. 17. Pressure distribution for three-element airfoil at a Mach numbers of 0.15 and 0.26 compared with computational results. 
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Fig. 18. Surface triangulation for wing with partial span flap. 
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Fig. 19. Comparison of span station pressure distributions for 
wing with partial-span flap; Experimental conditions M, = 0.2 , 
a = 10.0° , and Re = 3.7 x 10* . 


Concluding Remarks 

An implicit multigrid code for computing incompressible 
turbulent flows on unstructured grids is described. Results are 
presented to examine the effectiveness of several of the avail- 
able options included in the code and to demonstrate the accu- 
racy of the code for several test cases by comparing with ex- 
perimental data. It is shown that while fast convergence in 
terms of iterations can be obtained using Newton-type solv- 
ers, the most effective way of reducing computer times is 
through the use of multiple grids. When using a Newton-type 
solver, mesh sequencing can be used to obtain an initial solu- 
tion on coarser meshes which is then interpolated to the finest 
mesh. However, multigrid acceleration used in conjunction 
with a simpler “non-Newton” solver that requires signifi- 
cantly less memory exhibits the fastest convergence for the 
test case. Viscous turbulent flow results for the NACA4412 
airfoil are shown and compare well with experimental data 
and with results from a well established structured-grid com- 
pressible code. A study is conducted to examine compress- 
ibility effects for a three-element airfoil by comparing incom- 
pressible and compressible computational results with 
experimental data. The results show that the incompressible 
computations compare well with compressible computations 
as well as experimental data at a low freestream Mach number 
of 0.15. However, when comparing to experimental data at a 
freestream Mach number of 0.26, significant effects due to 
compressibility effect are observed in the experimental data 
which are well predicted using the compressible flow solver 
but are not accounted for using the incompressible assump- 
tion. Finally, three-dimensional turbulent computations are 
shown for a wing with a partial span flap that exhibit good 
agreement with experiment. 

Also described and demonstrated is a two-dimensional grid 
generation procedure for generating good quality highly 
stretched grids for viscous computations. The grid generation 
procedure is automated through the use of a Perl script, and is 
robust and extremely easy to use. 
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Case 

number 

Linear 

Solver 

Matrix- 

Vector 

product 

a 

CFL1/ 

CFL2 

Ramp of 
CFL 

Search 

Directions 

GMRES 

Cycles 

Tolerance 

ILU 

Gauss- 

Seidel 

iterations 

CPU 

Comment 

i 

GS 

NA 

10/200 

linear/50 

NA 

NA 

NA 

NA 

15 

598 

Baseline 

2 

GS 

NA 

10/200 

linear/50 

NA 

NA 

NA 

NA 

15 

158 

3-level V- 
cycle 

3 

GS 

NA 

10/200 

linear/50 

NA 

NA 

NA 

NA 

10 

114 

3-level W- 
cycle 

4 

GS 

NA 

10/500 

linear/50 

NA 

NA 

NA 

NA 

5 

70 

4-level W- 
cycle 

5 

GMRES 

0 

100/50K 

Ap 

12 

10 

0.001 

1 

NA 

1469 


6 

GMRES 

0 

100/50K 

A p 

12 

3 

0.001 

1 

NA 

761 


7 

GMRES 

0 

10/200 

linear/50 

10 

3 

0.1 

0 

1 

660 

Diagonal 

precondition 

8 

GMRES 

0 

10/200 

Ap 

12 

1 

0.1 

1 

NA 

629 


9 

GMRES 

0 

10/200 

linear/50 

10 

3 

0.1 

1 

NA 

626 


10 

GMRES 

0 

10/50K 

Ap 

12 

1 

0.1 

0 

3 

365 

GMRES 

stalled 

11 

GMRES 

0 

10/200 

Ap 

12 

1 

0.1 

0 

3 

519 


12 

GMRES 

0 

10/50K 

Ap 

12 

1 

0.1 

1 

NA 

419 


13 

GMRES 

0 

10/50K 

Ap 

12/12/12 

1/1/1 

0.1 

1 

NA 

226 

GMRES/ 

multigrid 

14 

GMRES 

1 

100/50K 

Ap 

20 

15 

0.001 

1 

NA 

995 


15 

GMRES 

1 

100/50K 

Ap 

12 

10 

0.001 

1 

NA 

793 


16 

GMRES 

1 

100/50K 

Ap 

12/12/12 

3/3/10 

0.001 

1 

NA 

281 

Mesh 

sequencing 
with 3 grids 

17 

GMRES 

1 

20/1 OK 

Ap 

12 

1 

0.1 

1 

NA 

118 

Multigrid/ 

Newton- 

Krylov 

18 

GMRES 

2 

100/50K 

Ap 

20 

15 

0.001 

1 

NA 

762 


19 

GMRES 

2 

100/50K 

Ap 

12 

10 

0.001 

1 

NA 

612 


20 

GMRES 

2 

100/50K 

Ap 

12 

3 

0.001 

1 

NA 

323 


21 

GMRES 

2 

100/50K 

Ap 

12/12/12 

3/3/10 

0.001 

1 

NA 

217 

Mesh 

sequencing 
with 3 grids 

22 

GMRES 

2 

100/50K 

Ap 

12 

1 

0.1 

1 

NA 

90 

Multigrid 
with exact 
linearizations 


Table 1 Effect of varying parameters on computer time required to reduce residual 6 orders of magnitude for4-element airfoil of Suddhoo and Hall. 


a Numbers in this column indicate the following 

0 indicates exact linearization of first-order system (nearest neighbors). 

1 indicates Newton-Krylov method (finite difference of residual). 

2 indicates exact linearizations for higher order system with method of reference 8. 
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